Where possible, the notation dest = op (dest, src) applies to both single-precision floats held in an MMX register; the operation is executed on the two in parallel. The instructions pi2fw, pf2iw, pfnacc, pfpnacc and pswapd were introduced on Athlon.
SIMD-FP solution | |
pi2fd | dest = float(dwordsrc) |
pf2id | dest = dword(floatsrc) |
pi2fw | dest[63..32] = float(wordsrc[47..32]) dest[31..0] = float(wordsrc[15..0]) |
pf2iw | dest[63..32] = sign-extended word(floatsrc[63..32]) dest[31..0] = sign-extended word(floatsrc[31..0]) |
pfacc | dest.hi = src.hi + src.lo, dest.lo = dest.hi + dest.lo |
pfnacc | dest.hi = src.lo - src.hi, dest.lo = dest.lo - dest.hi |
pfpnacc | dest.hi = src.lo + src.hi, dest.lo = dest.lo - dest.hi |
pfadd | dest = dest + src |
pfsub | dest = dest - src |
pfsubr | dest = src - dest |
pfcmpeq | dest = (dest == src) ? 0xFFFFFFFF : 0 |
pfcmpge | dest = (dest >= src) ? 0xFFFFFFFF : 0 |
pfcmpgt | dest = (dest > src) ? 0xFFFFFFFF : 0 |
pfmin | dest = min (dest, src) |
pfmax | dest = max (dest, src) |
pfmul | dest = dest * src |
pfrcp | dest.hi = dest.lo = approx14(1/src.lo) |
pfrsqrt | dest.hi = dest.lo = approx15(1/sqrt(src.lo)) |
pfrcpit1 | first iteration of reciprocal approximation |
pfrcpit2 | second iteration of reciprocal and reciprocal square root approximation |
pfrsqit1 | first iteration of reciprocal square root approximation |
Extensions to MMX | |
pavgusb | dest = average (dest, src) (on unsigned bytes) |
pmulhrw | dest = (dest * src + 0x8000) >> 16 (on signed words); rounding variant of pmulhw, for fixed point math |
pswapd | dest.hi = src.lo, dest.lo = src.hi |
femms | fast empty MMX state |
prefetch | prefetch data to L1 cache |
prefetchw | prefetch data that is about to be modified; on current processors does not differ from prefetch |
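To make the dest/src notation concrete, here is a minimal scalar sketch in C of a few rows of the table. It is a model of the semantics only, not how 3DNow! code is actually written (that would be assembly or compiler intrinsics). The types v2f and v2u, the one-function-per-instruction layout and the helper dot2 are illustrative assumptions, and the round-up in pavgusb (adding 1 before halving) is stated here explicitly although the table only says "average".

```c
#include <stdint.h>

/* Hypothetical model of one 64-bit MMX register holding two floats. */
typedef struct { float lo, hi; } v2f;
/* Same register viewed as two 32-bit masks (result of the compares). */
typedef struct { uint32_t lo, hi; } v2u;

/* pfadd, pfmul: the same operation on both halves in parallel. */
static inline v2f pfadd(v2f d, v2f s) { return (v2f){ d.lo + s.lo, d.hi + s.hi }; }
static inline v2f pfmul(v2f d, v2f s) { return (v2f){ d.lo * s.lo, d.hi * s.hi }; }

/* pfacc: horizontal add; dest halves land in lo, src halves in hi. */
static inline v2f pfacc(v2f d, v2f s) { return (v2f){ d.lo + d.hi, s.lo + s.hi }; }

/* pswapd: exchange the two halves of src. */
static inline v2f pswapd(v2f s) { return (v2f){ s.hi, s.lo }; }

/* pfcmpge: all-ones mask where dest >= src, zero otherwise. */
static inline v2u pfcmpge(v2f d, v2f s)
{
    return (v2u){ d.lo >= s.lo ? 0xFFFFFFFFu : 0u,
                  d.hi >= s.hi ? 0xFFFFFFFFu : 0u };
}

/* pavgusb: rounded average of eight unsigned bytes, written back to d. */
static inline void pavgusb(uint8_t d[8], const uint8_t s[8])
{
    for (int i = 0; i < 8; i++)
        d[i] = (uint8_t)((d[i] + s[i] + 1) >> 1);
}

/* Illustrative helper: 2-element dot product as pfmul followed by pfacc. */
static inline float dot2(v2f a, v2f b)
{
    v2f p = pfmul(a, b);   /* (a.lo*b.lo, a.hi*b.hi) */
    return pfacc(p, p).lo; /* sum of the two products */
}
```

The dot2 helper shows the typical pairing: a vertical pfmul followed by a horizontal pfacc collapses the two products into one sum, which is the pattern pfacc exists for.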